1  Introduction


Measurement errors are the differences between the true values and the observed values. In practice, it is usually very difficult to obtain the true values exactly. Instead, one only obtains observed values that are related to the true values through the measurement errors. Extra care must be taken to deal with measurement errors in the analysis, because the observed data are noisier than the underlying true values.

Measurement error models are those in which one or more of the explanatory variables cannot be observed directly and are measured with error. Fuller (1987) gave
a  comprehensive  introduction  to  measurement  error  models.  Carroll,  Ruppert,  and 
Stefanski  (1995)  discussed  nonlinear  measurement  error  models  and  the  corresponding 
approaches. 

In  this  paper,  we  are  concerned  with  a  problem  of  selecting  a  treatment  that  has 
the  strongest  relationship  between  an  explanatory  variable  and  the  response  variable  in 
a  linear  measurement  error  model.  For  the  general  approaches  to  statistical  selection 
problems,  references  can  be  made  to  Bechhofer,  Santner,  and  Goldsman  (1996)  and 
Gupta  and  Panchapakesan  (1996). 

The measurement error model of interest is as follows. Suppose there are $k$ treatments $\Pi_i$, $i = 1, \ldots, k$, and $n$ observations from each treatment. For each treatment $\Pi_i$, $i = 1, \ldots, k$, and each observation $j = 1, \ldots, n$, we have the following model:

$$Y_{ij} = \beta_{0i} + \beta_{1i} X_{ij} + \epsilon_{ij}, \qquad W_{ij} = X_{ij} + U_{ij}. \tag{1}$$

For each $i = 1, \ldots, k$, the intercept $\beta_{0i}$ and the slope $\beta_{1i}$ are both unknown, and $\{(X_{ij}, U_{ij}, \epsilon_{ij}),\ 1 \le j \le n\}$ are assumed independent with mean $(0, 0, 0)$ and covariance $\mathrm{diag}(\sigma_{xxi}, \sigma_{uui}, \sigma_{\epsilon\epsilon i})$, where $\mathrm{diag}(\sigma_{xxi}, \sigma_{uui}, \sigma_{\epsilon\epsilon i})$ refers to the $3 \times 3$ matrix whose diagonal elements are $\sigma_{xxi}$, $\sigma_{uui}$, and $\sigma_{\epsilon\epsilon i}$ while the remaining elements are all 0. We assume that for each $i$, $\sigma_{uui}$ is known.

We are interested in the relationship between the explanatory variable $X$ and the response variable $Y$. However, $X_{ij}$ cannot be observed directly; instead we observe $W_{ij}$, which is $X_{ij}$ contaminated by an additive error term $U_{ij}$. An interesting question here is: how do we select the treatment that has the strongest relationship between the explanatory variable $X$ and the response variable $Y$?

In this selection problem, the slope $\beta_{1i}$ is important. It is the rate of change in the mean value of $Y$ with respect to $X$ and is therefore a measure of the strength of the relationship between $X$ and $Y$. Gupta and Lin (1997) studied a selection problem in which the selection criterion was to select the treatment that has the largest slope under this modeling setting.
However, sometimes the relationship between $X$ and $Y$ can take opposite directions and the $k$ slopes can have different signs. In other words, some slopes may be negative while others are positive. If we keep the criterion of selecting the largest slope in this situation, then we are essentially excluding the negative slopes and considering only the positive ones. From this point of view, it is necessary to broaden the scope of our consideration and generalize the selection problem studied in Gupta and Lin (1997). In this paper, we study this problem and derive a selection procedure whose selection criterion is to select the treatment that has the largest absolute value of the slope.


A treatment $\Pi_i$ is said to be the best if the absolute value of its slope $|\beta_{1i}|$ is the largest, i.e., $|\beta_{1i}| = \max_{1 \le j \le k} |\beta_{1j}|$. Otherwise the treatment is said to be non-best. The selection goal is to select the best treatment.

Let $\Omega = \{\boldsymbol{\beta}_1 = (\beta_{11}, \beta_{12}, \ldots, \beta_{1k}) \mid \beta_{1i} \in R,\ i = 1, \ldots, k\}$ be the parameter space and $a = (a_1, \ldots, a_k)$ be an action, where $a_i = 0$ or $1$, $i = 1, \ldots, k$. When action $a$ is taken, $a_i = 1$ means treatment $\Pi_i$ is selected as the best and $a_i = 0$ means $\Pi_i$ is excluded as non-best. For $i = 1, \ldots, k$, let $W_i = (W_{i1}, \ldots, W_{in})$, $Y_i = (Y_{i1}, \ldots, Y_{in})$, $\mathbf{W} = (W_1, \ldots, W_k)$, and $\mathbf{Y} = (Y_1, \ldots, Y_k)$. Let $\chi$ be the sample space generated by $(\mathbf{W}, \mathbf{Y})$. Since the true order of $|\beta_{11}|, \ldots, |\beta_{1k}|$ is unknown, we denote the ordered values by $|\beta_{1[1]}| \le |\beta_{1[2]}| \le \cdots \le |\beta_{1[k]}|$. For simplicity, we assume that $|\beta_{1[k]}| - |\beta_{1[k-1]}| = 2\delta > 0$, where $\delta$ is unknown.

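A selection rule $d(\mathbf{w}, \mathbf{y}) = (d_1(\mathbf{w}, \mathbf{y}), \ldots, d_k(\mathbf{w}, \mathbf{y}))$ is a mapping defined on $\chi$, where $d_i(\mathbf{w}, \mathbf{y})$ is the probability that, given $\mathbf{W} = \mathbf{w}$ and $\mathbf{Y} = \mathbf{y}$, $\Pi_i$ is selected as the best. Also, $\sum_{i=1}^{k} d_i(\mathbf{w}, \mathbf{y}) = 1$ for all $(\mathbf{w}, \mathbf{y}) \in \chi$. In other words, only one of the $k$ treatments will be selected as the best.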


We consider the following loss function:

$$L(\boldsymbol{\beta}_1, a) = \begin{cases} 1, & \text{if the best treatment is not selected,} \\ 0, & \text{if the best treatment is selected.} \end{cases} \tag{2}$$


2  Formulation  of  the  Selection  Procedure 


Before we develop a selection procedure for this problem, let us first look at the estimation of these slopes. Fuller (1987) has shown that ordinary least squares regression is not appropriate in this case because the ordinary regression slope estimate is biased toward 0. We will use moment estimators instead.
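The attenuation toward 0 can be illustrated numerically. The following is a minimal sketch, not part of the original study: it assumes normally distributed $(X, U, \epsilon)$ and illustrative values $\beta_1 = 0.6$, $\sigma_{xx} = \sigma_{uu} = \sigma_{\epsilon\epsilon} = 1$, and the function names are hypothetical. Under model (1), the least squares slope of $Y$ on $W$ estimates $\beta_1 \sigma_{xx} / (\sigma_{xx} + \sigma_{uu})$ rather than $\beta_1$.

```python
import random


def ols_slope(w, y):
    """Ordinary least squares slope of y regressed on w."""
    n = len(w)
    w_bar = sum(w) / n
    y_bar = sum(y) / n
    s_wy = sum((a - w_bar) * (b - y_bar) for a, b in zip(w, y))
    s_ww = sum((a - w_bar) ** 2 for a in w)
    return s_wy / s_ww


def simulate_treatment(n, beta0, beta1, sigma_xx, sigma_uu, sigma_ee, rng):
    """Draw n pairs (W, Y) from model (1); normality is an assumption here."""
    w, y = [], []
    for _ in range(n):
        x = rng.gauss(0.0, sigma_xx ** 0.5)   # unobserved true covariate
        u = rng.gauss(0.0, sigma_uu ** 0.5)   # measurement error
        e = rng.gauss(0.0, sigma_ee ** 0.5)   # equation error
        w.append(x + u)
        y.append(beta0 + beta1 * x + e)
    return w, y


rng = random.Random(0)
beta1, sigma_xx, sigma_uu = 0.6, 1.0, 1.0
w, y = simulate_treatment(20000, 0.0, beta1, sigma_xx, sigma_uu, 1.0, rng)

naive_slope = ols_slope(w, y)
attenuated_target = beta1 * sigma_xx / (sigma_xx + sigma_uu)
print(f"OLS slope on W: {naive_slope:.3f}, "
      f"true slope: {beta1}, attenuated value: {attenuated_target:.3f}")
```

With equal variances for $X$ and $U$, the naive slope settles near half of the true slope, which is why a correction using the known $\sigma_{uu}$ is needed.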

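The population moments of $(W_{ij}, Y_{ij})$ satisfy

$$(\mu_{wi}, \mu_{yi}) = (E W_{ij}, E Y_{ij}) = (0, \beta_{0i}), \tag{3}$$

and

$$\begin{aligned}
(\sigma_{wwi}, \sigma_{wyi}, \sigma_{yyi})
&= (\mathrm{Var}\, W_{ij},\ \mathrm{Cov}(W_{ij}, Y_{ij}),\ \mathrm{Var}\, Y_{ij}) \\
&= (\sigma_{xxi} + \sigma_{uui},\ \beta_{1i}\sigma_{xxi},\ \beta_{1i}^2 \sigma_{xxi} + \sigma_{\epsilon\epsilon i}).
\end{aligned} \tag{4}$$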

The sample means $(\bar{W}_i, \bar{Y}_i)$ and the sample covariances $(S_{wwi}, S_{wyi}, S_{yyi})$, where, for example,

$$S_{wyi} = \frac{1}{n-1} \sum_{j=1}^{n} (W_{ij} - \bar{W}_i)(Y_{ij} - \bar{Y}_i), \tag{5}$$

will be the basis of our selection procedure.

We estimate the parameters by replacing the unknown population moments with their sample moments. Note that $\sigma_{xxi}$ should be positive; otherwise $X_{ij}$ can take only one value for all $j = 1, \ldots, n$ and there is no point in studying the quantitative relationship between $X_i$ and $Y_i$. Therefore, the estimator $\hat{\sigma}_{xxi}$ should be positive as well. Let $\hat{\sigma}_{xxi} = S_{wwi} - \sigma_{uui}$ when $S_{wwi} - \sigma_{uui}$ is positive; otherwise let $\hat{\sigma}_{xxi} = S_{yyi}^{-1} S_{wyi}^{2}$. Also define

$$\hat{\beta}_{1i} = \begin{cases} (S_{wwi} - \sigma_{uui})^{-1} S_{wyi}, & \text{if } S_{wwi} - \sigma_{uui} > 0, \\ S_{wyi}^{-1} S_{yyi}, & \text{otherwise.} \end{cases} \tag{6}$$
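The slope estimator in (6) can be sketched in a few lines. This is only an illustration under the reading of (6) in which the slope is $S_{wyi} / (S_{wwi} - \sigma_{uui})$ when $S_{wwi} - \sigma_{uui} > 0$ and $S_{yyi} / S_{wyi}$ otherwise; the function names `sample_cov` and `slope_estimate` are hypothetical, and the degenerate case $S_{wyi} = 0$ is not handled.

```python
def sample_cov(a, b):
    """Sample covariance with divisor n - 1, as in equation (5)."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    return sum((p - a_bar) * (q - b_bar) for p, q in zip(a, b)) / (n - 1)


def slope_estimate(w, y, sigma_uu):
    """Moment estimator of the slope for one treatment, given known sigma_uu."""
    s_ww = sample_cov(w, w)
    s_wy = sample_cov(w, y)
    s_yy = sample_cov(y, y)
    if s_ww - sigma_uu > 0:
        # Usual case: correct the observed variance of W for measurement error.
        return s_wy / (s_ww - sigma_uu)
    # Fallback branch of (6), used when the corrected variance is not positive.
    return s_yy / s_wy


# Small deterministic example: y = 2 * w exactly, S_ww = 5/3.
w = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
corrected = slope_estimate(w, y, sigma_uu=2.0 / 3.0)   # first branch of (6)
fallback = slope_estimate(w, y, sigma_uu=2.0)          # second branch of (6)
print(corrected, fallback)
```

In the example, subtracting $\sigma_{uu} = 2/3$ from $S_{ww} = 5/3$ inflates the slope from the least squares value 2 to about 10/3, while a value of $\sigma_{uu}$ larger than $S_{ww}$ triggers the fallback branch.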


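We construct the selection procedure $d_n(\mathbf{w}, \mathbf{y}) = (d_{1n}(\mathbf{w}, \mathbf{y}), d_{2n}(\mathbf{w}, \mathbf{y}), \ldots, d_{kn}(\mathbf{w}, \mathbf{y}))$ as follows:

$$d_{in}(\mathbf{w}, \mathbf{y}) = \begin{cases} 1, & \text{if } |\hat{\beta}_{1i}| = \max_{1 \le j \le k} |\hat{\beta}_{1j}|, \\ 0, & \text{otherwise,} \end{cases} \tag{7}$$

when $\mathbf{W} = \mathbf{w}$ and $\mathbf{Y} = \mathbf{y}$ are observed. In other words, the treatment associated with the largest estimated absolute value of the slope, $\max_{1 \le j \le k} |\hat{\beta}_{1j}|$, will be selected as the best.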
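As a small illustration of rule (7), the sketch below maps a list of estimated slopes to the 0/1 selection vector. The function name `select_best` and the example values are hypothetical; in the event of an exact tie the function marks every tied treatment, which follows (7) as written.

```python
def select_best(slope_estimates):
    """Return the 0/1 vector of rule (7): 1 for the largest |slope|, else 0."""
    abs_slopes = [abs(b) for b in slope_estimates]
    largest = max(abs_slopes)
    return [1 if a == largest else 0 for a in abs_slopes]


# A negative slope with the largest magnitude is selected.
decision = select_best([0.35, 0.52, -0.61])
print(decision)
```

Selecting on the plain slope instead would have picked the second treatment here, which is exactly the situation the absolute-value criterion is meant to handle.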


3  Performance  of  the  Selection  Procedures 

We  now  study  the  performance  of  the  selection  procedure  developed  in  (7).  A  measure 
of  the  performance  of  this  decision  rule  is  the  probability  of  making  a  wrong  decision 
when  using  this  rule.  Since  in  this  case  the  loss  function  is  the  0-1  loss,  the  probability 
of  making  a  wrong  decision  is  the  expected  risk  of  the  proposed  procedure.  We  would 
like  the  probability  of  making  a  wrong  decision  to  be  as  small  as  possible. 
Denote by $P_n$ the probability measure generated by the random observations $(\mathbf{W}, \mathbf{Y})$, and for each $(\mathbf{w}, \mathbf{y}) \in \chi$, let

$$i^* = \{ i \mid |\beta_{1i}| = \max_{1 \le j \le k} |\beta_{1j}| = |\beta_{1[k]}|,\ i = 1, \ldots, k \}, \tag{8}$$

and

$$i_n^* = \{ i \mid |\hat{\beta}_{1i}| = \max_{1 \le j \le k} |\hat{\beta}_{1j}|,\ i = 1, \ldots, k \}. \tag{9}$$

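Then, the expected risk of the proposed selection procedure is

$$\begin{aligned}
E\,L(\boldsymbol{\beta}_1, d_n(\mathbf{W}, \mathbf{Y}))
&= \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} P_n\{ i^* = i,\ i_n^* = j \} \\
&\le \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} P_n\{ i^* = i,\ i_n^* = j,\ S_{wwi} - \sigma_{uui} \ge \tfrac{\sigma_{xxi}}{2},\ S_{wwj} - \sigma_{uuj} \ge \tfrac{\sigma_{xxj}}{2} \} \\
&\quad + \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} P_n\{ i^* = i,\ i_n^* = j,\ S_{wwi} - \sigma_{uui} < \tfrac{\sigma_{xxi}}{2} \} \\
&\quad + \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} P_n\{ i^* = i,\ i_n^* = j,\ S_{wwj} - \sigma_{uuj} < \tfrac{\sigma_{xxj}}{2} \} \\
&\le \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} P_n\{ |\hat{\beta}_{1i}| - |\beta_{1i}| < -\delta,\ S_{wwi} - \sigma_{uui} \ge \tfrac{\sigma_{xxi}}{2},\ S_{wwj} - \sigma_{uuj} \ge \tfrac{\sigma_{xxj}}{2} \} \\
&\quad + \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} P_n\{ |\hat{\beta}_{1j}| - |\beta_{1j}| > \delta,\ S_{wwi} - \sigma_{uui} \ge \tfrac{\sigma_{xxi}}{2},\ S_{wwj} - \sigma_{uuj} \ge \tfrac{\sigma_{xxj}}{2} \} \\
&\quad + \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} P_n\{ S_{wwi} - \sigma_{uui} < \tfrac{\sigma_{xxi}}{2} \}
 + \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} P_n\{ S_{wwj} - \sigma_{uuj} < \tfrac{\sigma_{xxj}}{2} \} \\
&\le \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} P_n\{ |\hat{\beta}_{1i}| - |\beta_{1i}| < -\delta,\ S_{wwi} - \sigma_{uui} \ge \tfrac{\sigma_{xxi}}{2} \} \\
&\quad + \sum_{i=1}^{k} \sum_{j=1, j \ne i}^{k} P_n\{ |\hat{\beta}_{1j}| - |\beta_{1j}| > \delta,\ S_{wwj} - \sigma_{uuj} \ge \tfrac{\sigma_{xxj}}{2} \} \\
&\quad + 2k \sum_{i=1}^{k} P_n\{ S_{wwi} - \sigma_{uui} < \tfrac{\sigma_{xxi}}{2} \} \\
&\le 2k \sum_{i=1}^{k} P_n\{ |\hat{\beta}_{1i} - \beta_{1i}| > \delta,\ S_{wwi} - \sigma_{uui} \ge \tfrac{\sigma_{xxi}}{2} \}
 + 2k \sum_{i=1}^{k} P_n\{ S_{wwi} - \sigma_{uui} < \tfrac{\sigma_{xxi}}{2} \}.
\end{aligned} \tag{10}$$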

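From the above we observe that it suffices to analyze the performance of the following two sequences:

$$P_n\{ S_{wwi} - \sigma_{uui} < \tfrac{\sigma_{xxi}}{2} \}, \qquad P_n\{ |\hat{\beta}_{1i} - \beta_{1i}| > \delta,\ S_{wwi} - \sigma_{uui} \ge \tfrac{\sigma_{xxi}}{2} \}. \tag{11}$$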


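Definition 1  A sequence of selection procedures $\{d_n(\mathbf{w}, \mathbf{y})\}_{n=1}^{\infty}$ is said to be asymptotically optimal of order $\epsilon_n$ if $E\,L(\boldsymbol{\beta}_1, d_n(\mathbf{W}, \mathbf{Y})) = O(\epsilon_n)$, where $\{\epsilon_n\}$ is a sequence of positive numbers such that $\lim_{n \to \infty} \epsilon_n = 0$.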

The  large  sample  performance  of  the  derived  selection  rule  dn(w, y)  will  be  analyzed 
in  two  situations. 


3.1  When the $\alpha$-th Moment Exists ($\alpha > 2$)

In this subsection, we suppose that the $\alpha$-th ($\alpha > 2$) moments of $(X_{ij}, U_{ij}, \epsilon_{ij})$ exist, that is,

$$E|X_{ij}|^{\alpha} < \infty, \quad E|U_{ij}|^{\alpha} < \infty, \quad E|\epsilon_{ij}|^{\alpha} < \infty. \tag{12}$$

We will show that the expected risk of the proposed selection procedure converges to 0 at the rate of $o(n^{-(\alpha/2 - 1)})$.

We introduce some useful lemmas. The first lemma is well known; a similar result can be found in Baum and Katz (1965).

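Lemma 1  Let $X_1, \ldots, X_n$ be independent random variables with mean 0. Suppose that for a fixed number $\alpha > 1$, $E|X_i|^{\alpha} < \infty$ for $i = 1, \ldots, n$. Then for any $\epsilon > 0$,

$$P\Big\{ \Big| \sum_{j=1}^{n} X_j / n \Big| > \epsilon \Big\} = o(n^{-(\alpha - 1)}). \tag{13}$$

As a consequence of Lemma 1, we have

Lemma 2  Let $X_1, \ldots, X_n$ be independent random variables with mean $E X_i = \mu$ and variance $\mathrm{Var}\, X_i = \sigma^2$, for $i = 1, \ldots, n$. Let $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$ and $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$. Suppose that for $i = 1, \ldots, n$ and a fixed number $\alpha > 2$, $E|X_i|^{\alpha} < \infty$. Then for any $\epsilon > 0$,

$$P\{ |S^2 - \sigma^2| > \epsilon \} = o(n^{-(\alpha/2 - 1)}). \tag{14}$$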

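Proof. Since $\frac{n-1}{n}(S^2 - \sigma^2) = \frac{1}{n} \sum_{i=1}^{n} (X_i^2 - (\mu^2 + \sigma^2)) - (\bar{X}^2 - \mu^2) + \frac{\sigma^2}{n}$, we have

$$\begin{aligned}
P\{ |S^2 - \sigma^2| > \epsilon \}
&= P\Big\{ \Big| \frac{1}{n} \sum_{i=1}^{n} (X_i^2 - (\mu^2 + \sigma^2)) - (\bar{X}^2 - \mu^2) + \frac{\sigma^2}{n} \Big| > \frac{n-1}{n}\,\epsilon \Big\} \\
&\le P\Big\{ \Big| \frac{1}{n} \sum_{i=1}^{n} (X_i^2 - (\mu^2 + \sigma^2)) \Big| > \frac{n-1}{n} \cdot \frac{\epsilon}{2} \Big\}
 + P\Big\{ |\bar{X}^2 - \mu^2| > \frac{n-1}{n} \cdot \frac{\epsilon}{2} - \frac{\sigma^2}{n} \Big\} \\
&\le P\Big\{ \Big| \frac{1}{n} \sum_{i=1}^{n} (X_i^2 - (\mu^2 + \sigma^2)) \Big| > \frac{\epsilon}{4} \Big\}
 + P\Big\{ |\bar{X}^2 - \mu^2| > \frac{\epsilon}{4} \Big\} \\
&:= I_1 + I_2
\end{aligned} \tag{15}$$

for $n$ large enough; $n > 2 + 4\sigma^2/\epsilon$ suffices, since then $\frac{n-1}{n} > \frac{1}{2}$ and $\frac{n-1}{n} \cdot \frac{\epsilon}{2} - \frac{\sigma^2}{n} > \frac{\epsilon}{4}$. The variables $X_i^2 - (\mu^2 + \sigma^2)$ are independent with mean 0 and finite absolute moment of order $\alpha/2 > 1$, so from Lemma 1 we have

$$I_1 = P\Big\{ \Big| \frac{1}{n} \sum_{i=1}^{n} (X_i^2 - (\mu^2 + \sigma^2)) \Big| > \frac{\epsilon}{4} \Big\} = o(n^{-(\alpha/2 - 1)}) \tag{16}$$

and

$$\begin{aligned}
I_2 &= P\{ |\bar{X}^2 - \mu^2| > \tfrac{\epsilon}{4} \} \\
&= P\{ |(\bar{X} + \mu)(\bar{X} - \mu)| > \tfrac{\epsilon}{4} \text{ and } |\bar{X} + \mu| > 2|\mu| + 1 \} \\
&\quad + P\{ |(\bar{X} + \mu)(\bar{X} - \mu)| > \tfrac{\epsilon}{4} \text{ and } |\bar{X} + \mu| \le 2|\mu| + 1 \} \\
&\le P\{ |\bar{X} - \mu| > 1 \} + P\{ 4(2|\mu| + 1)\,|\bar{X} - \mu| > \epsilon \} \\
&= o(n^{-(\alpha - 1)}).
\end{aligned} \tag{17}$$

Since $o(n^{-(\alpha - 1)})$ is also $o(n^{-(\alpha/2 - 1)})$, combining (16) and (17) gives (14).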

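From Lemma 2, we can see that

$$P_n\{ S_{wwi} - \sigma_{uui} < \tfrac{\sigma_{xxi}}{2} \} = P_n\{ S_{wwi} - \sigma_{wwi} < -\tfrac{\sigma_{xxi}}{2} \} = o(n^{-(\alpha/2 - 1)}). \tag{18}$$

Moreover,

$$\begin{aligned}
&P_n\{ |\hat{\beta}_{1i} - \beta_{1i}| > \delta,\ S_{wwi} - \sigma_{uui} \ge \tfrac{\sigma_{xxi}}{2} \} \\
&= P_n\Big\{ \Big| \frac{S_{wyi}}{S_{wwi} - \sigma_{uui}} - \beta_{1i} \Big| > \delta,\ S_{wwi} - \sigma_{uui} \ge \tfrac{\sigma_{xxi}}{2} \Big\} \\
&\le P_n\{ |S_{wyi} - \beta_{1i}(S_{wwi} - \sigma_{uui})| > \delta\,\tfrac{\sigma_{xxi}}{2} \} \\
&= P_n\Big\{ \Big| \frac{1}{n-1} \sum_{j=1}^{n} (W_{ij} - \bar{W}_i)(Y_{ij} - \bar{Y}_i) - \beta_{1i} \Big( \frac{1}{n-1} \sum_{j=1}^{n} (W_{ij} - \bar{W}_i)^2 - \sigma_{uui} \Big) \Big| > \delta\,\tfrac{\sigma_{xxi}}{2} \Big\} \\
&= P_n\Big\{ \Big| \frac{1}{n-1} \sum_{j=1}^{n} W_{ij} Y_{ij} - \frac{n}{n-1} \bar{W}_i \bar{Y}_i - \frac{1}{n-1} \beta_{1i} \sum_{j=1}^{n} W_{ij}^2 + \frac{n}{n-1} \beta_{1i} \bar{W}_i^2 + \beta_{1i} \sigma_{uui} \Big| > \delta\,\tfrac{\sigma_{xxi}}{2} \Big\} \\
&\le P_n\Big\{ \Big| \frac{1}{n-1} \sum_{j=1}^{n} W_{ij} Y_{ij} - \frac{n}{n-1} \beta_{1i} \sigma_{xxi} \Big| > \delta\,\tfrac{\sigma_{xxi}}{6} \Big\} \\
&\quad + P_n\Big\{ \Big| \frac{n}{n-1} \bar{W}_i \bar{Y}_i - \frac{n}{n-1} \beta_{1i} \bar{W}_i^2 + \frac{1}{n-1} \beta_{1i} \sigma_{uui} \Big| > \delta\,\tfrac{\sigma_{xxi}}{6} \Big\} \\
&\quad + P_n\Big\{ \Big| \frac{1}{n-1} \beta_{1i} \sum_{j=1}^{n} W_{ij}^2 - \frac{n}{n-1} \beta_{1i} (\sigma_{xxi} + \sigma_{uui}) \Big| > \delta\,\tfrac{\sigma_{xxi}}{6} \Big\} \\
&:= J_1 + J_2 + J_3.
\end{aligned} \tag{19}$$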


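For any $i = 1, \ldots, k$, $\{W_{ij} Y_{ij},\ j = 1, \ldots, n\}$ are independent random variables with mean $E(W_{ij} Y_{ij}) = \beta_{1i} \sigma_{xxi}$. By Hölder's inequality,

$$E|W_{ij} Y_{ij}|^{\alpha/2} \le \sqrt{E|W_{ij}|^{\alpha}\, E|Y_{ij}|^{\alpha}} < \infty, \tag{20}$$

therefore, by Lemma 1 applied with exponent $\alpha/2$, we have

$$J_1 = P_n\Big\{ \Big| \frac{1}{n} \sum_{j=1}^{n} (W_{ij} Y_{ij} - \beta_{1i} \sigma_{xxi}) \Big| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{6} \Big\} = o(n^{-(\alpha/2 - 1)}). \tag{21}$$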

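Since

$$\begin{aligned}
\bar{W}_i \bar{Y}_i - \beta_{1i} \bar{W}_i^2
&= \bar{W}_i (\bar{Y}_i - \beta_{1i} \bar{W}_i) \\
&= \bar{W}_i (\beta_{0i} + \bar{\epsilon}_i - \beta_{1i} \bar{U}_i) \\
&= \beta_{0i} \bar{W}_i + \bar{\epsilon}_i \bar{W}_i - \beta_{1i} \bar{X}_i \bar{U}_i - \beta_{1i} \bar{U}_i^2,
\end{aligned} \tag{22}$$

we observe that

$$\begin{aligned}
J_2 &= P_n\Big\{ \Big| \beta_{0i} \bar{W}_i + \bar{\epsilon}_i \bar{W}_i - \beta_{1i} \bar{X}_i \bar{U}_i - \beta_{1i} \bar{U}_i^2 + \frac{1}{n} \beta_{1i} \sigma_{uui} \Big| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{6} \Big\} \\
&\le P_n\Big\{ |\beta_{0i} \bar{W}_i| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} \Big\}
 + P_n\Big\{ |\bar{\epsilon}_i \bar{W}_i| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} \Big\} \\
&\quad + P_n\Big\{ |\beta_{1i} \bar{X}_i \bar{U}_i| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} \Big\}
 + P_n\Big\{ \Big| \beta_{1i} \bar{U}_i^2 - \frac{1}{n} \beta_{1i} \sigma_{uui} \Big| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} \Big\} \\
&\le P_n\Big\{ |\beta_{0i} \bar{W}_i| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} \Big\}
 + P_n\Big\{ |\bar{\epsilon}_i| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\}
 + P_n\Big\{ |\bar{W}_i| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\} \\
&\quad + P_n\Big\{ \big| \sqrt{|\beta_{1i}|}\, \bar{X}_i \big| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\}
 + P_n\Big\{ \big| \sqrt{|\beta_{1i}|}\, \bar{U}_i \big| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\} \\
&\quad + P_n\Big\{ \Big| \beta_{1i} \Big( \bar{U}_i^2 - \frac{1}{n} \sigma_{uui} \Big) \Big| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} \Big\}.
\end{aligned} \tag{23}$$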

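Then by Lemma 1, we have

$$P_n\Big\{ |\beta_{0i} \bar{W}_i| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} \Big\} = \begin{cases} o(n^{-(\alpha - 1)}), & \text{if } \beta_{0i} \ne 0, \\ 0, & \text{if } \beta_{0i} = 0, \end{cases} \; = o(n^{-(\alpha - 1)}), \tag{24}$$

$$P_n\Big\{ |\bar{\epsilon}_i| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\} = o(n^{-(\alpha - 1)}), \tag{25}$$

$$P_n\Big\{ |\bar{W}_i| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\} = o(n^{-(\alpha - 1)}), \tag{26}$$

$$P_n\Big\{ \big| \sqrt{|\beta_{1i}|}\, \bar{X}_i \big| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\} = o(n^{-(\alpha - 1)}), \tag{27}$$

$$P_n\Big\{ \big| \sqrt{|\beta_{1i}|}\, \bar{U}_i \big| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\} = o(n^{-(\alpha - 1)}), \tag{28}$$

and

$$P_n\Big\{ \Big| \beta_{1i} \Big( \bar{U}_i^2 - \frac{1}{n} \sigma_{uui} \Big) \Big| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} \Big\} \tag{29}$$

$$\begin{aligned}
&\le P_n\Big\{ |\beta_{1i} \bar{U}_i^2| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} - \frac{1}{n} |\beta_{1i}| \sigma_{uui} \Big\} \\
&= P_n\Big\{ \big| \sqrt{|\beta_{1i}|}\, \bar{U}_i \big| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} - \frac{1}{n} |\beta_{1i}| \sigma_{uui}} \Big\} \\
&= o(n^{-(\alpha - 1)}).
\end{aligned} \tag{30}$$

Therefore, $J_2 = o(n^{-(\alpha - 1)})$. Similarly,

$$J_3 = P_n\Big\{ \Big| \frac{1}{n} \beta_{1i} \sum_{j=1}^{n} W_{ij}^2 - \beta_{1i} (\sigma_{xxi} + \sigma_{uui}) \Big| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{6} \Big\} = o(n^{-(\alpha/2 - 1)}).$$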

Hence,  by  combining  the  above  arguments,  we  have  the  following  theorem. 

Theorem 1  The selection procedure $d_n(\mathbf{w}, \mathbf{y})$, defined in (7), is asymptotically optimal with a convergence rate of order $o(n^{-(\alpha/2 - 1)})$ under condition (12). That is,

$$E\,L(\boldsymbol{\beta}_1, d_n(\mathbf{W}, \mathbf{Y})) = o(n^{-(\alpha/2 - 1)}). \tag{31}$$


3.2  When  The  Moment  Generating  Function  Exists 

In this subsection, we suppose that the moment generating functions of $\{X_{ij}^2, U_{ij}^2, \epsilon_{ij}^2\}$ exist in a neighborhood of the origin, that is, for $-T < t < T$,

$$E e^{t X_{ij}^2} < \infty, \quad E e^{t U_{ij}^2} < \infty, \quad E e^{t \epsilon_{ij}^2} < \infty, \tag{32}$$

where $T$ is a positive constant.

We  first  introduce  the  following  lemma,  which  can  be  found  in  Petrov  (1995). 

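Lemma 3  Let $\{X_1, \ldots, X_n\}$ be independent random variables with mean $E X_i = 0$, $i = 1, \ldots, n$. Suppose there exist positive constants $g_1, \ldots, g_n$ and $T$ such that

$$E e^{t X_i} \le e^{g_i t^2 / 2} \quad (i = 1, \ldots, n) \tag{33}$$

for $-T \le t \le T$. Let $G_n = \sum_{i=1}^{n} g_i$. Then

$$P\Big\{ \Big| \sum_{i=1}^{n} X_i \Big| \ge x \Big\} \le \begin{cases} 2 e^{-x^2 / (2 G_n)}, & \text{if } 0 \le x \le G_n T, \\ 2 e^{-T x / 2}, & \text{if } x \ge G_n T. \end{cases} \tag{34}$$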

The  following  lemma  clarifies  the  probabilistic  meaning  of  the  conditions  of  Lemma  3. 

Lemma 4  Let $X$ be a random variable with mean $E X = 0$. The following two assertions are equivalent:

(I) There exist positive constants $g$ and $H$ such that

$$E e^{t X} \le e^{g t^2 / 2} \quad \text{for } -H \le t \le H. \tag{35}$$

(II) There exists a positive constant $T$ such that

$$E e^{t X} < \infty \quad \text{for } -T \le t \le T. \tag{36}$$

Proof. It is clear that (I) implies (II). We now prove that (II) also implies (I). If (II) holds, then the random variable $X$ has moments of all orders, and the following relation holds:

$$\log E e^{t X} = \tfrac{1}{2} \sigma^2 t^2 + o(t^2) \tag{37}$$

as $t \to 0$, where $\sigma^2 = E X^2$. For any constant $g > \sigma^2$, the inequalities $\log E e^{t X} \le g t^2 / 2$ and $E e^{t X} \le e^{g t^2 / 2}$ hold for all sufficiently small $t$, that is, (I) is true. This completes the proof of Lemma 4. As we can see in the proof, we can always set $g = 2\sigma^2$.

We further assume that the 4-th moments of $\{X_{ij}, U_{ij}, \epsilon_{ij}\}$ are uniformly bounded, that is, there exists a positive constant $C$ such that

$$E X_{ij}^4 \le C, \quad E U_{ij}^4 \le C, \quad E \epsilon_{ij}^4 \le C. \tag{38}$$

We can see from (38) that $E W_{ij}^4$, $E Y_{ij}^4$ and $E (W_{ij} Y_{ij})^2$ are all bounded.


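We analyze $P_n\{ S_{wwi} - \sigma_{uui} < \tfrac{\sigma_{xxi}}{2} \}$ first. Write $\epsilon = \sigma_{xxi} / 2$. Since $\sigma_{wwi} = \sigma_{xxi} + \sigma_{uui}$ and $E W_{ij} = 0$,

$$\begin{aligned}
P_n\{ S_{wwi} - \sigma_{uui} < \tfrac{\sigma_{xxi}}{2} \}
&= P_n\{ S_{wwi} - \sigma_{wwi} < -\epsilon \} \\
&\le P_n\{ |S_{wwi} - \sigma_{wwi}| > \epsilon \} \\
&\le P_n\Big\{ \Big| \frac{1}{n} \sum_{j=1}^{n} (W_{ij}^2 - \sigma_{wwi}) \Big| > \frac{n-1}{n} \cdot \frac{\epsilon}{2} \Big\}
 + P_n\Big\{ |\bar{W}_i^2| > \frac{n-1}{n} \cdot \frac{\epsilon}{2} - \frac{\sigma_{wwi}}{n} \Big\} \\
&\le P_n\Big\{ \Big| \frac{1}{n} \sum_{j=1}^{n} (W_{ij}^2 - \sigma_{wwi}) \Big| > \frac{\epsilon}{4} \Big\}
 + P_n\Big\{ |\bar{W}_i| > \sqrt{\frac{\epsilon}{4}} \Big\} \\
&:= K_1 + K_2,
\end{aligned} \tag{39}$$

for $n$ large enough; $n > 2 + 4\sigma_{wwi}/\epsilon$ suffices, since then $\frac{n-1}{n} > \frac{1}{2}$ and $\frac{n-1}{n} \cdot \frac{\epsilon}{2} - \frac{\sigma_{wwi}}{n} > \frac{\epsilon}{4}$. For $j = 1, \ldots, n$, $E(W_{ij}^2 - \sigma_{wwi}) = 0$ and, for $-T/2 < t < T/2$,

$$E e^{t W_{ij}^2} \le E e^{2|t| (X_{ij}^2 + U_{ij}^2)} = E\big( e^{2|t| X_{ij}^2} e^{2|t| U_{ij}^2} \big) = E\big( e^{2|t| X_{ij}^2} \big) E\big( e^{2|t| U_{ij}^2} \big) < \infty. \tag{40}$$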
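By Lemma 3 and Lemma 4, we have

$$K_1 = P_n\Big\{ \Big| \frac{1}{n} \sum_{j=1}^{n} (W_{ij}^2 - \sigma_{wwi}) \Big| > \frac{\epsilon}{4} \Big\} \le \begin{cases} 2 e^{-n^2 \epsilon^2 / (32 G_n)}, & \text{if } \epsilon \le 2 T G_n / n, \\ 2 e^{-(T \epsilon / 8) n}, & \text{if } \epsilon > 2 T G_n / n, \end{cases}$$

where $G_n$ is twice the sum of the $n$ variances of $(W_{ij}^2 - \sigma_{wwi})$, $j = 1, \ldots, n$. Since $(E W_{ij}^4,\ j = 1, \ldots, n)$ are bounded, $G_n = O(n)$. Therefore,

$$K_1 \le \begin{cases} 2 e^{-n^2 \epsilon^2 / (32 G_n)}, & \text{if } \epsilon \le 2 T G_n / n, \\ 2 e^{-(T \epsilon / 8) n}, & \text{if } \epsilon > 2 T G_n / n, \end{cases} \; = O(e^{-c^*_{K_1} n}),$$

where $c^*_{K_1}$ is a positive constant. Similarly, for $-T < t < T$,

$$E e^{t W_{ij}} \le E e^{|t W_{ij}|} < \infty,$$

so that

$$K_2 = P_n\Big\{ |\bar{W}_i| > \sqrt{\frac{\epsilon}{4}} \Big\} = O(e^{-c^*_{K_2} n}),$$

where $c^*_{K_2}$ is also a positive constant.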

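Next we consider $P_n\{ |\hat{\beta}_{1i} - \beta_{1i}| > \delta,\ S_{wwi} - \sigma_{uui} \ge \tfrac{\sigma_{xxi}}{2} \}$. We have

$$\begin{aligned}
&P_n\{ |\hat{\beta}_{1i} - \beta_{1i}| > \delta,\ S_{wwi} - \sigma_{uui} \ge \tfrac{\sigma_{xxi}}{2} \} \\
&\le P_n\Big\{ \Big| \frac{1}{n-1} \sum_{j=1}^{n} W_{ij} Y_{ij} - \frac{n}{n-1} \beta_{1i} \sigma_{xxi} \Big| > \delta\,\tfrac{\sigma_{xxi}}{6} \Big\} \\
&\quad + P_n\Big\{ \Big| \frac{n}{n-1} \bar{W}_i \bar{Y}_i - \frac{n}{n-1} \beta_{1i} \bar{W}_i^2 + \frac{1}{n-1} \beta_{1i} \sigma_{uui} \Big| > \delta\,\tfrac{\sigma_{xxi}}{6} \Big\} \\
&\quad + P_n\Big\{ \Big| \frac{1}{n-1} \beta_{1i} \sum_{j=1}^{n} W_{ij}^2 - \frac{n}{n-1} \beta_{1i} (\sigma_{xxi} + \sigma_{uui}) \Big| > \delta\,\tfrac{\sigma_{xxi}}{6} \Big\} \\
&:= L_1 + L_2 + L_3.
\end{aligned} \tag{45}$$

For any $i = 1, \ldots, k$, $\{W_{ij} Y_{ij},\ j = 1, \ldots, n\}$ are independent. By the Cauchy-Schwarz inequality, we have, for $-T/2 < t < T/2$,

$$E e^{t W_{ij} Y_{ij}} \le E e^{|t W_{ij} Y_{ij}|} \le E e^{|t| (W_{ij}^2 + Y_{ij}^2) / 2} \le \sqrt{E e^{|t| W_{ij}^2}\, E e^{|t| Y_{ij}^2}} < \infty.$$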
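Besides, for each $i$ and $j$, the variance of $W_{ij} Y_{ij}$ is bounded. Therefore,

$$L_1 = P_n\Big\{ \Big| \frac{1}{n} \sum_{j=1}^{n} (W_{ij} Y_{ij} - \beta_{1i} \sigma_{xxi}) \Big| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{6} \Big\} = O(e^{-c^*_{L_1} n}), \tag{47}$$

where $c^*_{L_1}$ is a positive constant. Next we analyze $L_2$ and $L_3$. Similarly,

$$\begin{aligned}
L_2 &\le P_n\Big\{ |\beta_{0i} \bar{W}_i| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} \Big\}
 + P_n\Big\{ |\bar{\epsilon}_i| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\}
 + P_n\Big\{ |\bar{W}_i| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\} \\
&\quad + P_n\Big\{ \big| \sqrt{|\beta_{1i}|}\, \bar{X}_i \big| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\}
 + P_n\Big\{ \big| \sqrt{|\beta_{1i}|}\, \bar{U}_i \big| > \sqrt{\frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24}} \Big\} \\
&\quad + P_n\Big\{ \Big| \beta_{1i} \Big( \bar{U}_i^2 - \frac{1}{n} \sigma_{uui} \Big) \Big| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{24} \Big\} \\
&= O(e^{-c^*_{L_2} n}),
\end{aligned} \tag{48}$$

and

$$L_3 = P_n\Big\{ \Big| \frac{1}{n} \beta_{1i} \sum_{j=1}^{n} W_{ij}^2 - \beta_{1i} (\sigma_{xxi} + \sigma_{uui}) \Big| > \frac{n-1}{n} \cdot \frac{\delta \sigma_{xxi}}{6} \Big\} = O(e^{-c^*_{L_3} n}), \tag{49}$$

where $c^*_{L_2}$ and $c^*_{L_3}$ are positive constants. Hence, by the above argument, if we set $c^* = \min(c^*_{K_1}, c^*_{K_2}, c^*_{L_1}, c^*_{L_2}, c^*_{L_3})$, then $c^* > 0$. We have the following theorem.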

Theorem 2  The selection procedure $d_n(\mathbf{w}, \mathbf{y})$, as defined in (7), is asymptotically optimal with a convergence rate of order $O(e^{-c^* n})$ under conditions (32) and (38). That is,

$$E\,L(\boldsymbol{\beta}_1, d_n(\mathbf{W}, \mathbf{Y})) = O(e^{-c^* n}), \tag{50}$$

where $c^* > 0$ is defined as above. We consider two special situations next.

Two special situations.

1. $\{(X_{ij}, U_{ij}, \epsilon_{ij}),\ 1 \le j \le n\}$ are normally distributed. In this case, $\{(X_{ij}, U_{ij}, \epsilon_{ij})\}$ are i.i.d. $N_3((0, 0, 0), \mathrm{diag}(\sigma_{xxi}, \sigma_{uui}, \sigma_{\epsilon\epsilon i}))$. Since $X_{ij}^2 / \sigma_{xxi}$, $U_{ij}^2 / \sigma_{uui}$, and $\epsilon_{ij}^2 / \sigma_{\epsilon\epsilon i}$ follow $\chi^2$ distributions, their moment generating functions exist. By Theorem 2, the selection procedure $d_n(\mathbf{w}, \mathbf{y})$ in this case is asymptotically optimal with a rate of convergence of order $O(e^{-c^* n})$.

2. $\{(X_{ij}, U_{ij}, \epsilon_{ij}),\ 1 \le j \le n\}$ are bounded. Then conditions (32) and (38) always hold and therefore the selection procedure $d_n(\mathbf{w}, \mathbf{y})$ is asymptotically optimal with a convergence rate of order $O(e^{-c^* n})$.
4  Simulations


We carried out a simulation study to investigate the performance of the selection procedure $d_n$. The expected risk $E\,L(\boldsymbol{\beta}_1, d_n(\mathbf{W}, \mathbf{Y}))$ is used as the measure of the performance of the selection rule. In this study, we considered normal distributions with $k = 3$ treatments. The simulation scheme is as follows:

1. For each $j = 1, \ldots, n$ and $i = 1, 2, 3$, we generated independent random observations $(X_{ij}, U_{ij}, \epsilon_{ij})$ from the multivariate normal distribution $N_3((0, 0, 0)^T, \mathrm{diag}(\sigma_{xxi}, \sigma_{uui}, \sigma_{\epsilon\epsilon i}))$.

2. Let $W_{ij} = X_{ij} + U_{ij}$ and $Y_{ij} = \beta_{0i} + \beta_{1i} X_{ij} + \epsilon_{ij}$.

3. Based on $(W_{ij}, Y_{ij})$, we obtained the estimator of $\beta_{1i}$, then made the selection using $d_n$ and computed $D(\mathbf{W}, \mathbf{Y})$, defined as

$$D(\mathbf{W}, \mathbf{Y}) = \begin{cases} 1, & \text{if we make a wrong selection,} \\ 0, & \text{if we make a correct selection.} \end{cases} \tag{51}$$

4. Steps 1, 2 and 3 were repeated 10000 times. For each set of observations, $D(\mathbf{W}, \mathbf{Y})$ is either 0 or 1, according to whether the decision is right or wrong. By the law of large numbers, the average of $D(\mathbf{W}, \mathbf{Y})$ over repeated samples approaches the expected risk $E\,L(\boldsymbol{\beta}_1, d_n(\mathbf{W}, \mathbf{Y}))$ and can be used as an estimator of the expected risk when the number of iterations is large enough.
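The four steps above can be sketched as a short Monte Carlo routine. This is an illustrative reimplementation under stated assumptions, not the code used for the study: it uses the parameter values reported for Table 1 (all variances equal to 1, zero intercepts, slopes 0.4, 0.5 and -0.6), reads estimator (6) as $S_{wy} / (S_{ww} - \sigma_{uu})$ with fallback $S_{yy} / S_{wy}$, uses only 400 replications to keep the run short, and all function names are hypothetical.

```python
import random


def sample_cov(a, b):
    """Sample covariance with divisor n - 1."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    return sum((p - a_bar) * (q - b_bar) for p, q in zip(a, b)) / (n - 1)


def slope_estimate(w, y, sigma_uu):
    """Moment estimator of the slope, following the reading of (6) above."""
    s_ww = sample_cov(w, w)
    s_wy = sample_cov(w, y)
    if s_ww - sigma_uu > 0:
        return s_wy / (s_ww - sigma_uu)
    return sample_cov(y, y) / s_wy


def wrong_selection(n, slopes, rng):
    """Steps 1-3: simulate one data set and return D = 1 if the pick is wrong."""
    estimates = []
    for beta1 in slopes:
        w, y = [], []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            u = rng.gauss(0.0, 1.0)
            e = rng.gauss(0.0, 1.0)
            w.append(x + u)          # W = X + U
            y.append(beta1 * x + e)  # Y = beta0 + beta1 * X + eps, beta0 = 0
        estimates.append(abs(slope_estimate(w, y, 1.0)))
    true_best = max(range(len(slopes)), key=lambda i: abs(slopes[i]))
    chosen = max(range(len(slopes)), key=lambda i: estimates[i])
    return 0 if chosen == true_best else 1


def estimated_risk(n, slopes, replications, seed=0):
    """Step 4: average D over independent replications."""
    rng = random.Random(seed)
    total = sum(wrong_selection(n, slopes, rng) for _ in range(replications))
    return total / replications


slopes = [0.4, 0.5, -0.6]
risk_small_n = estimated_risk(20, slopes, replications=400)
risk_large_n = estimated_risk(200, slopes, replications=400)
print(f"n=20: {risk_small_n:.3f}, n=200: {risk_large_n:.3f}")
```

With so few replications the estimates are rough, but the drop in the error rate between $n = 20$ and $n = 200$ should already be visible; raising `replications` to 10000 corresponds to the setting described in step 4.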


We set the number of iterations to 10000 so that the deviation between the estimated value and the true value is less than 0.01 with approximately 95% confidence. The reasoning is as follows. We are interested in the unknown probability of making a wrong decision. We draw the sample repeatedly, and each time the result is either right or wrong. We therefore have a binomial setting: we use the sample proportion (denoted by $\hat{p}$) to estimate the population proportion (denoted by $p$). When the number of iterations (denoted by $N$) is large enough,

$$\frac{\hat{p} - p}{\sqrt{p(1-p)/N}} \sim N(0, 1)$$

approximately. With 95% confidence, $|\hat{p} - p| < 2\sqrt{p(1-p)/N}$. When $N = 10000$, since $p(1-p) \le 0.25$,

$$|\hat{p} - p| < 2\sqrt{\frac{p(1-p)}{N}} \le 2 \times \sqrt{\frac{0.25}{10000}} = 0.01.$$

This is why the number of iterations was set to 10000. The results from the simulation study are listed in Table 1 for the case where

$$\begin{aligned}
&\sigma_{xx1} = \sigma_{xx2} = \sigma_{xx3} = 1, \\
&\sigma_{uu1} = \sigma_{uu2} = \sigma_{uu3} = 1, \\
&\sigma_{\epsilon\epsilon 1} = \sigma_{\epsilon\epsilon 2} = \sigma_{\epsilon\epsilon 3} = 1, \\
&\beta_{01} = \beta_{02} = \beta_{03} = 0, \\
&\beta_{11} = 0.4, \quad \beta_{12} = 0.5, \quad \beta_{13} = -0.6,
\end{aligned}$$

and

$$n = 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1500, 2000. \tag{52}$$
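As a quick check of the iteration-count argument above, the worst-case margin $2\sqrt{0.25/N}$ can be evaluated directly. The helper name `worst_case_margin` is hypothetical, and the multiplier 2 is the rounded normal quantile used in the text.

```python
import math


def worst_case_margin(num_iterations, z=2.0):
    """Upper bound on |p_hat - p| at roughly 95% confidence, using p(1-p) <= 0.25."""
    return z * math.sqrt(0.25 / num_iterations)


margin_10000 = worst_case_margin(10000)
margin_1000 = worst_case_margin(1000)
print(f"N=10000: {margin_10000:.4f}, N=1000: {margin_1000:.4f}")
```

The margin shrinks like $1/\sqrt{N}$, so a tenfold reduction in the number of iterations would widen it by a factor of about 3.2.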

The curve of the estimated probability of making a wrong decision with respect to $n$ is shown in Figure 1 at the end of this paper. It is consistent with our conclusion that the probability of making a wrong decision converges to 0 at a rate of $O(e^{-c^* n})$ in this case.
Table 1: Estimated probability $D_n$ of making a wrong selection, by sample size $n$

     n      D_n
    10   0.5443
    20   0.4851
    30   0.4425
    40   0.3926
    50   0.3667
    60   0.3578
    70   0.3334
    80   0.3219
    90   0.2862
   100   0.2743
   200   0.1627
   300   0.1103
   400   0.0684
   500   0.0488
   600   0.0315
   700   0.0271
   800   0.0246
   900   0.0163
  1000   0.0139
  1100   0.0089
  1200   0.0064
  1500   0.0035
  2000   0.0004
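One informal way to look at the exponential rate is to regress $\log D_n$ on $n$ using the values in Table 1; a roughly linear trend with negative slope is what an $O(e^{-c^* n})$ rate would suggest. The sketch below is an illustration added here, not part of the original analysis: the function name is hypothetical, and a straight-line fit on the log scale is only a descriptive check that cannot by itself establish the rate.

```python
import math

# (n, D_n) pairs copied from Table 1.
table_1 = [
    (10, 0.5443), (20, 0.4851), (30, 0.4425), (40, 0.3926), (50, 0.3667),
    (60, 0.3578), (70, 0.3334), (80, 0.3219), (90, 0.2862), (100, 0.2743),
    (200, 0.1627), (300, 0.1103), (400, 0.0684), (500, 0.0488),
    (600, 0.0315), (700, 0.0271), (800, 0.0246), (900, 0.0163),
    (1000, 0.0139), (1100, 0.0089), (1200, 0.0064), (1500, 0.0035),
    (2000, 0.0004),
]


def log_linear_slope(points):
    """Least squares slope of log(D_n) against n."""
    xs = [float(n) for n, _ in points]
    ys = [math.log(d) for _, d in points]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


slope = log_linear_slope(table_1)
print(f"fitted decay constant: {-slope:.4f} per observation")
```

The negated slope gives a rough empirical counterpart of the constant $c^*$ for this particular parameter configuration.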